Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Turn evt def instantiable #3293

Closed
wants to merge 2 commits into from
Closed

Turn evt def instantiable #3293

wants to merge 2 commits into from

Conversation

rafaeldtinoco
Copy link
Contributor

@rafaeldtinoco rafaeldtinoco commented Jun 30, 2023

commit f46fa02 (HEAD -> turn-evt-def-instantiable, rafaeldtinoco/turn-evt-def-instantiable)
Author: Rafael David Tinoco rafaeldtinoco@gmail.com
Date: Thu Jun 29 22:52:53 2023

chore(pkg/events): add unit tests for event group and dependencies

Make sure all types are thread-safe (for dynamic changes from API server).

commit 6b1a9be
Author: Rafael David Tinoco rafaeldtinoco@gmail.com
Date: Thu Jun 29 22:53:05 2023

feature(pkg/events): turn events instantiable

- Will allow dynamic reconfiguration during runtime (API server).

- Will allow the Tracee Extensions feature:

  - Map Group
  - Probe Group (PR: #3223)
  - Functions
  - Event Group (this PR)
  - Signature(s)

related: #3170

@rafaeldtinoco
Copy link
Contributor Author

Still needs some changes (like GetParams() being called multiple times for the same slice, forgot to change) but I think it can go through a first review round. Thanks.

Copy link
Collaborator

@NDStrahilevitz NDStrahilevitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stopping the review at this point since I believe all my next comments would be similar.
Main point I have is to consider if we actually need a lock on a lot of the fields we added them for, instead of just making them immutable by design.

@@ -24,6 +24,14 @@ import (
)

func GetTraceeRunner(c *cobra.Command, version string) (cmd.Runner, error) {
// Initialize event definitions

// events.Definitions = events.NewEventGroup()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover?


type Dependencies struct {
events map[ID]struct{} // map[eventID]struct{}
probes map[probes.Handle]*Probe // map[handle]*ProbeDependency
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it actually necessary for the map values to be pointers? Would map[probes.Handle]Probe not work as well?

nit: ProbeSettings as a name, just since the Probe is already a type of the probes package, this struct is more about loading settings.

Copy link
Contributor Author

@rafaeldtinoco rafaeldtinoco Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About being a pointer: the "probe dependency" has a "required/not_required" bool (with set/unset/read methods) that needs to be atomic (and singular) among all execution threads.

About the need for an instance: is "required/not_required" something we would change runtime? I believe it would when adding/removing policies).

About the naming, unfortunately there is a similar problem to all "dep_XXX" types:

Event: is actually EventDefinition and not "an event" (confuses with Tracee Event). Probe is ProbeDependency. KSymbol is a KSymbolDependency. Capabilities are CapabilitiesNeeded.

So, at least for now, I kept:

  1. event_group
  2. event
  3. type_XXX (for types used in dependencies)
  4. dep_YYY (for methods of types used in dependencies)

But I'm opened to suggestions (so we create a pattern for all of them together, not a single one).

Update: check #3293 (comment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the need for an instance: is "required/not_required" something we would change runtime? I believe it would when adding/removing policies).

I don't think there is such a need. When writing the event, the author decides if a probe is required or not for the event to function correctly (or at least partially work). This information about the event should be something that can be changed in runtime

eventsLock *sync.RWMutex
probesLock *sync.RWMutex
kSymbolsLock *sync.RWMutex
tailCallsLock *sync.RWMutex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, tails calls are the only dependency we modify in runtime (which I used as a bit of a hack when we did syscall optimization a while back). I think it's not very ideal in the first place that we can modify dependencies.

IMO instead of adding locks, the best solution is to just leave what shouldn't be mutable as immutable. I think for all the above (and we should later do the same for tails calls), there's no need to lock, the solution should be not allowing any mutation after initial setting. So the setters should be removed and only Get methods should be left.

Copy link
Contributor Author

@rafaeldtinoco rafaeldtinoco Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree BUT I don't know what will be mutable or not (yet) until the API server is designed and the extensions feature is finished. My syn here is to allow setters/getters for most of the fields and I'm not against disallowing that after the API server (and other features) are done. I needed something to start with and chose to allow everything first and reduce as appropriate (instead of the opposite).

update: check last comment on this thread.

Copy link
Contributor Author

@rafaeldtinoco rafaeldtinoco Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About tailcalls being mutable or not: I can think of multiple situations where adding/removing indexes would be done during runtime (during a reconfiguration of policies and traced events).

And things such as (which I think you are referring):

image

become natural IF/WHEN we follow the encapsulation design pattern (which I think we should have done from the beginning as it allows us to protect data from inside the interface). Keeping local (from caller) slices/maps of info for definition updates brough us significant parallelism problems.

Like I said, until our reconfiguration capabilities is finished we don't know what should be instantiable and what shouldn't.

update: check last comment on this thread.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also something else I just thought.. by having event instances we can now move things out of Tracee singleton, such as eventConfig. The "submit/emit" fields are global and should be atomic and concurrently updated if/when needed.

Your "hack" for the tailcalls was also because we didn't have the "dynamic nature" of a "event definition" well defined. In your case, we started having instances in the definition, in the eventConfig case we started keeping track of things externally to the interface.

If you note, we had a "mix" of mutable vs immutable fields because we thought the definition of event would be "always static". By making it not "static" we can encapsulate the interface, better control concurrency and have a single point of view for atomic settings.

The event definition would become (and is, already, partially is) "event definition and status". WDYT ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the same like Nadav. In my view, event definition should only reflect "pure definition" of the event, and be immutable. Only the fields that we consider "status" (not sure that we have any of these today) and perhaps fields that can potentially be user configuable in the future (e.g. metadata fields of the event) should have protection, and then we should consider moving them into another structure (which will be protected, of course).
I think that doing such a seperation between event definition, which is immutable, and fields that can be dynamically changed in a different struct will be a better abstraction, cause less potential concurrency and performance issues, and also let us postpone the decision about which fields should be dynamic.

With that said, I also think it is a good thing to make all the event definition fields private, and let the user access them only with getters like you did.

// EventGroup is a struct describing a collection of events.
type EventGroup struct {
events map[ID]*Event
mutex *sync.RWMutex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought: Is there any reason that events should be added post tracee loading? We could potentially run lockless after starting tracee by not allowing any event definition loading after a certain point.
Suggesting this because we have some Getters used in the pipeline itself so they are hotpaths running RLock().

Copy link
Contributor Author

@rafaeldtinoco rafaeldtinoco Jul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do share your concern on locking contention, but, at the same time, we first need to make sure we protect what needs protection.

I'm not against removing the locks but a RLock is literally translated into a single atomic read (if there wasn't a write the status is even cached).

About adding events during runtime: You started tracee with core events, now you will enable network events, and later you will enable plugin01 events. Or, you just loaded new signature (which are events now) runtime, etc, etc.

I cannot map everything we want dynamic (or not) without first implementing it and the only way I found to be achievable was by making sure we follow concurrency patterns in an adequate way first.

Update: also related: #3293 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just spoke to @geyslan about this. I'll provide a full study on lock contention for this change so there is no doubt about being safe to have or not (and that will allow me to continue with the rest without regrets if we can perform with "no significant, or none, impact".

id32Bit *atomic.Uint32
name string
docPath string
strMutex *sync.RWMutex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strMutex as in for name and docpath? Not sure these should be mutable at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto (about now knowing what should or shouldn't be changed runtime). There are things that CLEARLY shouldn't: ksymbol name, it is literally the "handle" that describes a ksymbol dependency and won't change at all. But many of other fields are "debatable".

dependencies *atomic.Pointer[Dependencies]
sets map[string]struct{}
setsMutex *sync.RWMutex
params []trace.ArgMeta
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take the opportunity to rename these to args (or whatever was decided in the new event format). As this is also a definition, not sure this should be dynamic (should new argument definitions be appendable post definition?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I didn't think event arguments would be added/removed runtime for this work, it goes together with many other previous comment of first guaranteeing a stable interface for concurrency and later using it and refining.

@rafaeldtinoco
Copy link
Contributor Author

Main point I have is to consider if we actually need a lock on a lot of the fields we added them for, instead of just making them immutable by design.

Yep, I think you are right and I do have that in my mind. Unfortunately we don't know what will be mutable or not (yet ?) so I thought about making most of them capable of being set/got (and we can always make them immutable later). This is not a big change (to allow/disallow them to mutate). As long as they're part of the "instance" definition (so what is mutable is always on the same state during the program execution).

Make sure all types are thread-safe (for dynamic changes from API server).
- Will allow dynamic reconfiguration during runtime (API server).

- Will allow the Tracee Extensions feature:

  - Map Group
  - Probe Group (PR: #3223)
  - Functions
  - Event Group (this PR)
  - Signature(s)

related: #3170
Copy link
Collaborator

@yanivagman yanivagman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice change! I like the encapsulation done for the different structs.
Some comments follow

expectedEvents: []events.Event{
events.NewEventDefinition("fake_event_0", []string{"signatures", "default"}, []events.ID{events.HookedSyscalls}),
expectedEvents: []*events.Event{
events.NewEvent(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using NewEvent instead of NewEventDefinition may be misleading.
We don't create a new event here ("sourcing" the event) but defining a new event.
If you think definition is not the correct word to use then we can change (e.g. to NewEventConfiguration or similar) but I think NewEvent by itself is not enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and was expecting the definitions to be discussed and done. EventDefinition is good enough for me, I was just worried because we use event := NewEvtDefinition() in many places, its misleading there as well (to understand if its a definition or a Tracee.Event).

I'll try to change this to make it more clear.

Comment on lines +32 to +47
events.ID(6001), // id,
events.Sys32Undefined, // id32
"fake_event_0", // eventName
"", // docPath
false, // internal
false, // syscall
[]string{"signatures", "default"}, // sets
events.NewDependencies(
[]events.ID{events.HookedSyscalls}, // ids
nil, // probes
nil, // ksyms
nil, // tailcalls
nil, // capabilities
),
[]trace.ArgMeta{},
),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to pass this as a struct and not with arguments to a function.
This way, you can also used named parameters, e.g.:
eventName: fake_event_0

events.NewEventDefinition("fake_event_1", []string{"signatures", "default"}, []events.ID{events.Ptrace}),
events.NewEventDefinition("fake_event_2", []string{"signatures", "default"}, []events.ID{events.SecurityFileOpen, events.SecurityInodeRename}),
expectedEvents: []*events.Event{
events.NewEvent(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find the definition of NewEvent in this commit. Is it a new function? If so, it needs to be defined in this commit and not in the next one (or maybe this commit should come after the other one?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commits were inverted due to rebasing needs (likely). I can revert the commits and make tests come after.

Comment on lines -31 to +33
var err error
var output flags.PrepareOutputResult

output, err = flags.TraceeEbpfPrepareOutput(c.StringSlice("output"), false)
output, err := flags.TraceeEbpfPrepareOutput(c.StringSlice("output"), false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove the declaration of the output variable as well

Comment on lines +188 to +195
eDependencies := evtDef.GetDependencies()
if eDependencies == nil {
return errfmt.Errorf("event id %d has nil dependencies", eventId)
}
evtDepEvtsIDs := eDependencies.GetEvents()
if evtDepEvtsIDs == nil {
return errfmt.Errorf("event id %d has nil event dependencies", eventId)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it valid that event's dependencies will be nil?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that all dep_xxx.go files can go into dep.go.
This split creates too many files IMO which are not really necessary

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had same doubt. It got big, but I guess its ok since its all dependencies.


func makeDeriveBase(eventID events.ID) deriveBase {
def := getEventDefinition(eventID)
if events.Core == nil {
fmt.Printf("event definitions not found\n")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use the logger instead of printf.
This comment applies to all places where printf is used in this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, forgot about this (to convert them). Will do for sure.

//

// Event is a struct describing an event configuration.
type Event struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming to EventDefinition

Comment on lines +17 to +18
id *atomic.Uint32
id32Bit *atomic.Uint32
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a reason for the atomicity of these fields or any of the others (see my previous comment about event defintion immutability

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered everything to be thread-safe. I don't know when events would be added, by how many threads or how they would create them (today we create Signature Events, etc...). But I got what you are saying.

Comment on lines +101 to +102
e.mutex.RLock()
defer e.mutex.RUnlock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mutex has any meaning here?
Unless used with another operation, I don't think it has

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a single mutex for any map operation. Reading the length of a map isn't thread-safe (nor atomic) in Golang and, in order for this function to be thread-safe you need to read-lock the entire map, the way I see.

@rafaeldtinoco
Copy link
Contributor Author

I think the same like Nadav. In my view, event definition should only reflect "pure definition" of the event, and be immutable. Only the fields that we consider "status" (not sure that we have any of these today) and perhaps fields that can potentially be user configuable in the future (e.g. metadata fields of the event) should have protection, and then we should consider moving them into another structure (which will be protected, of course).

Okay, Event Definition in a declarative way, and immutable.

I think that doing such a seperation between event definition, which is immutable, and fields that can be dynamically changed in a different struct will be a better abstraction, cause less potential concurrency and performance issues, and also let us postpone the decision about which fields should be dynamic.

Ok, I'll wrap a event definition into a "events used" "state pattern design" (that is what you two are requesting basically), where I can save states of immutable objects and control thread-concurrency only for the mutable states.

With that said, I also think it is a good thing to make all the event definition fields private, and let the user access them only with getters like you did.

Thanks, no need for setters on immutable fields though.

I'll close this PR now and come up with something else then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants